Training audio events detectors with a sound effects corpus
Abstract
This paper describes the audio event detection work carried out within the VIDIVIDEO European project. Our first experiments concerned the detection of non-voice sounds, such as birds, machines, traffic, water and steps. Because no corpus labelled in terms of audio events was available, we used a relatively small sound effects corpus for training. Initial experiments with one-against-all SVM classifiers for these five classes showed that this type of data is feasible for training, thus avoiding the extremely tedious task of manually labelling a very large number of audio events. Preliminary integration experiments are quite promising.
Similar resources
Detecting audio events for semantic video search
This paper describes our work on audio event detection, one of our tasks in the European project VIDIVIDEO. Preliminary experiments with a small corpus of sound effects have shown the potential of this type of corpus for training purposes. This paper describes our experiments with SVM classifiers, and different features, using a 290-hour corpus of sound effects, which allowed us to build detect...
Non-Speech Acoustic Event Detection Using
Non-speech acoustic event detection (AED) aims to recognize events that are relevant to human activities associated with audio information. Much previous research has focused on restricted highlight events and has relied heavily on ad-hoc detectors for these events. This thesis focuses on using multimodal data in order to make non-speech acoustic event detection and classification tasks more r...
Audio Database in Support of Potential Threat and Crisis Situation Management
This paper describes a corpus consisting of audio data for automatic space monitoring based solely on the perceived acoustic information. The particular database is created as part of a project aiming at the detection of abnormal events, which lead to life-threatening situations or property damage. The audio corpus is composed of vocal reactions and environmental sounds that are usually encount...
A Transfer Learning Based Feature Extractor for Polyphonic Sound Event Detection Using Connectionist Temporal Classification
Sound event detection is the task of detecting the type, onset time, and offset time of sound events in audio streams. The mainstream solution is recurrent neural networks (RNNs), which usually predict the probability of each sound event at every time step. Connectionist temporal classification (CTC) has been applied in order to relax the need for exact annotations of onset and offset times; th...
Investigation of Noise Levels in Sugar Factory of Debal Khozaei Agro-industry Complex
Introduction and purpose: Noise pollution can exert negative effects on mental health. In this regard, the present study aimed to evaluate the noise levels in the sugar factory of the Debal Khozaei agro-industry complex in 2018. Methods: For the current study, sound and audio parameters were measured using a sound meter. These audio parameters included sound pressure level and minimum and maximum valu...